Multimodal Speaker Identification using Adaptive Decision Fusion with Reliability Weighted Summation

Authors

Engin Erzin

Yücel Yemez

A. Murat Tekalp

Abstract

We present a multimodal open-set speaker identification system that integrates information from the audio, face, and lip motion modalities. To fuse the modalities, we employ the so-called product rule with a novel adaptive, reliability-based weighting structure. The proposed adaptive product rule is more robust in the presence of unreliable modalities, provided that the reliability measure used is effective in assessing classifier decisions. The proposed reliability measure, which fits the open-set speaker identification problem naturally, is used to make more robust accept and reject decisions. Experimental results that support this assertion are provided.
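As a rough illustration of the idea in the abstract, a reliability-weighted product rule can be written as a weighted sum of per-modality log-likelihoods, followed by an accept/reject test for the open-set case. The sketch below is not the authors' implementation: the function names, the normalization of the weights, and the simple threshold test are all assumptions made for illustration, and the abstract does not say how the reliability values are computed.

```python
def fuse_log_likelihoods(log_likelihoods, reliabilities):
    """Reliability-weighted sum of per-modality log-likelihoods.

    A weighted product of likelihoods, prod_m p_m ** w_m, becomes a
    weighted sum in the log domain. Both arguments are dicts keyed by
    modality name (hypothetical layout, chosen for this sketch).
    """
    total = sum(reliabilities.values())
    if total <= 0:
        raise ValueError("at least one modality needs a positive reliability")
    n_speakers = len(next(iter(log_likelihoods.values())))
    fused = [0.0] * n_speakers
    for modality, scores in log_likelihoods.items():
        # Assumption: weights are normalized so that they sum to one.
        weight = reliabilities[modality] / total
        for speaker, score in enumerate(scores):
            fused[speaker] += weight * score
    return fused


def open_set_decision(fused, threshold):
    """Return the index of the best-scoring speaker, or None to reject.

    Assumption: rejection is a plain threshold on the best fused score.
    """
    best = max(range(len(fused)), key=fused.__getitem__)
    return best if fused[best] >= threshold else None


# Two enrolled speakers; audio is treated as three times as reliable as face.
scores = {"audio": [-1.0, -3.0], "face": [-4.0, -2.0]}
weights = {"audio": 3.0, "face": 1.0}
fused = fuse_log_likelihoods(scores, weights)
decision = open_set_decision(fused, threshold=-2.0)
```

With these made-up numbers the more reliable audio modality dominates, so the first speaker is accepted. Raising the threshold above the best fused score would turn the same input into a reject.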


Similar resources

Adaptive classifier cascade for multimodal speaker identification

We present a multimodal open-set speaker identification system that integrates information coming from audio, face and lip motion modalities. For fusion of multiple modalities, we propose a new adaptive cascade rule that favors reliable modality combinations through a cascade of classifiers. The order of the classifiers in the cascade is adaptively determined based on the reliability of each mo...

Multimodal speaker/speech recognition using lip motion, lip texture and audio

We present a new multimodal speaker/speech recognition system that integrates audio, lip texture and lip motion modalities. Fusion of audio and face texture modalities has been investigated in the literature before. The emphasis of this work is to investigate the benefits of inclusion of lip motion modality for two distinct cases: speaker and speech recognition. The audio modality is represente...

Discrimination Analysis of Lip Motion Features for Multimodal Speaker Identification and Speech-reading

In this thesis a new multimodal speaker/speech recognition system that integrates audio, lip texture, lip geometry, and lip motion modalities is presented. There have been several studies that jointly use audio, lip intensity and/or lip geometry information for speaker identification and speech-reading applications. This work proposes using explicit lip motion information, instead of or in addi...

Chapter 16: JOINT AUDIO-VIDEO PROCESSING FOR ROBUST BIOMETRIC SPEAKER IDENTIFICATION IN CAR

In this chapter, we present our recent results on a multilevel Bayesian decision fusion scheme for the multimodal audio-visual speaker identification problem. The objective is to improve recognition performance over conventional decision fusion schemes. The proposed system decomposes the information in a video stream into three components: speech, lip trace and face texture. Lip trac...

Fuzzy logic decision fusion in a multimodal biometric system

This paper presents a multi-biometric verification system that combines speaker verification and fingerprint verification with face identification. Their respective equal error rates (EER) are 4.3%, 5.1%, and a range of 5.1% to 11.5% for matched conditions in facial image capture. Fusion of the three by majority voting gave a relative improvement of 48% over speaker verification (i.e. the best-...

Publication date: 2004
